OPEN ACCESS Freely available online

PLOS ONE

Two Enhancers Control Transcription of Drosophila muscleblind in the Embryonic Somatic Musculature and in the Central Nervous System

Ariadna Bargiela1,2, Beatriz Llamusi1,2, Estefanía Cerro-Herreros1,2, Ruben Artero1,2*

1 Translational Genomics Group, Department of Genetics, University of Valencia, Valencia, Spain, 2 INCLIVA Health Research Institute, Valencia, Spain 

Abstract 

Muscleblind proteins form a phylogenetically conserved family of RNA-binding factors involved in a variety of gene
expression processes including alternative splicing regulation, RNA stability and subcellular localization, and miRNA
biogenesis, which typically contribute to cell-type specific differentiation. In humans, sequestration of Muscleblind-like
proteins MBNL1 and MBNL2 has been implicated in degenerative disorders, particularly expansion diseases such as
myotonic dystrophy type 1 and 2. Drosophila muscleblind was previously shown to be expressed in embryonic somatic and
visceral muscle subtypes, and in the central nervous system, and to depend on Mef2 for transcriptional activation. Genomic
approaches have pointed out candidate gene promoters and tissue-specific enhancers, but experimental confirmation of
their regulatory roles was lacking. In our study, luciferase reporter assays in S2 cells confirmed that regions P1 (515 bp) and
P2 (573 bp), involving the beginning of exon 1 and exon 2, respectively, were able to initiate RNA transcription. Similarly,
transgenic Drosophila embryos carrying enhancer reporter constructs supported the existence of two regulatory regions 
which control embryonic expression of muscleblind in the central nerve cord (NE, neural enhancer; 830 bp) and somatic 
(skeletal) musculature (ME, muscle enhancer; 3.3 kb). Both NE and ME were able to boost expression from the Hsp70 
heterologous promoter. In S2 cell assays most of the ME enhancer activation could be further narrowed down to a 1200 bp 
subregion (ME.3), which contains predicted binding sites for the Mef2 transcription factor. The present study constitutes the 
first characterization of muscleblind enhancers and will contribute to a deeper understanding of the transcriptional 
regulation of the gene. 

Introduction

Muscleblind proteins were initially identified in Drosophila and
associated with the development of the embryonic peripheral
nervous system [1], the muscles [2] and the adult photoreceptors
[3]. They were later found to regulate alternative splicing of
defined pre-mRNAs by binding to specific consensus sequences
and to hairpins containing pyrimidine mismatches through
conserved zinc finger motifs of the CCCH type ([4,5] and
reviewed in [6]). Muscleblind target transcripts encode cell
adhesion and cytoskeleton components, proteins involved in
muscle excitation and contraction, structural proteins of the muscle
sarcomere and signalling molecules, among others [5,7,8,9,10,11].
Through alternative splicing, muscleblind transcripts themselves
generate at least fourteen transcript isoforms. Most of them share
common 5' sequences but differ at the 3'-ends, encoding proteins
of different lengths and carboxyl termini. The muscleblind
transcriptional unit is large and has a complex organization, with
ten exons distributed over a region about thirty times longer than the
average gene length in Drosophila [5,12].

In contrast to Drosophila, which has a single gene, three
Muscleblind-like homologs (MBNL1, MBNL2, and MBNL3) exist
in humans and mice [13,14]. Although recent results have
highlighted MBNL proteins as regulators of messenger RNA
(mRNA) stability [15,16,17], localization [10,18] or miRNA
biogenesis [19] in the cytoplasm, these proteins are particularly
well-known for their nuclear function as alternative splicing
regulators. MBNL1 plays a primary role in alternative splicing,
allowing the fetal-to-adult splicing transitions needed for development
of skeletal and cardiac muscle, whereas MBNL2 seems to
perform a similar function in the central nervous system [20,21].
Similarly, MBNL1 and MBNL2 are direct negative regulators of a
large program of cassette exon alternative splicing events that are
differentially regulated between embryonic stem cells and other
cell types [22]. In contrast, MBNL3 has been reported as a
member of the family with unusual functions. MBNL3 antagonizes
muscle differentiation by promoting exclusion of the alternatively
spliced β-exon of Myocyte enhancer factor 2D (Mef2D) [23] and also
inhibits myogenesis by maintaining myoblasts in a
proliferative state [24,25]. As a result of this regulation, a negative
correlation exists between MBNL1 and MBNL3 expression levels
in muscle during development: MBNL3 is mainly detected
during embryonic development, but also transiently during injury-induced
adult skeletal muscle regeneration [13,26]. MBNL1 and
MBNL2 have a similar expression pattern in skeletal and heart
muscle, kidney, liver, lung, intestine, brain and placenta. However,
MBNL1 expression in skeletal muscle is higher than that of MBNL2
[13,25].

Drosophila Muscleblind shows tissue-specific expression during 
development. In eye-antennal imaginal discs Muscleblind is 
required for the formation of photoreceptor rhabdomeres, 
identifying muscleblind as a general factor required for terminal 
differentiation of adult ommatidia [3]. Its expression was also 
reported in the embryonic central nervous system and in the 
somatic muscles, where disruption of muscleblind caused defects in
muscle attachments to the epidermis and disrupted Z-band
formation in muscle sarcomeres [2]. Recent studies have revealed
a role for muscleblind in the myoblast fusion process through a
splice-independent regulation of muscle protein 20 (Mp20), a gene
that promotes myoblast fusion [11]. Consistent with its function
during terminal muscle differentiation, Drosophila Myocyte enhancer
factor 2 (Mef2) activates muscleblind in embryos, placing this
gene downstream of Mef2 function in the myogenic differentiation
program in flies [2]. The muscleblind chaste mutation has revealed
that the gene is not only required during embryo development but
also in the adult brain, where it is necessary for the normal
development of neural circuitry that regulates female sexual
receptivity [27].

Muscleblind-like proteins are critically involved in many 
pathogenesis pathways, but most notably in myotonic dystrophies 
type 1 and type 2 (DM1 and DM2; reviewed in [6,28]). DM1 is 
caused by the expansion of the unstable CTG triplet in the
3' untranslated region of the Dystrophia Myotonica Protein Kinase
(DMPK) gene [29]. DM2 patients carry an unstable CCTG repeat
expansion in intron 1 of CCHC-type zinc finger nucleic acid binding
protein (CNBP) [30]. In both cases, transcribed repeat expansions
form ribonuclear foci that have the ability to sequester, among 
others, MBNL proteins, which are therefore depleted from their 
normal functions [14,31,32]. DM1 and DM2 are typically 
regarded as muscular diseases but many other organs are also 
affected resulting in eye cataracts, cognitive dysfunction and 
cardiac conduction defects. 

Despite their biomedical and developmental relevance, knowledge
of the transcriptional regulation of muscleblind genes,
particularly in Drosophila and in humans, is extremely limited.
To fill this gap, in this study we performed in
silico and in vivo analyses to define gene promoters and tissue-specific
cis-regulatory regions that control Drosophila muscleblind
expression. Using a candidate approach, we have identified two
putative gene promoters, located in exon 1 and exon 2, and have
confirmed two intronic regions with the ability to drive expression
in embryonic somatic muscle and the nerve cord. This constitutes
the first description of tissue-specific enhancers and provides new
insights into the muscleblind gene.

Results 

1. Mapping of the Promoter Regions of Drosophila
muscleblind 

The analysis of available cDNA sequences and expressed
sequence tags (EST) involving the muscleblind locus showed that 5'-end
sequences clustered to two locations in the gene, to the
beginning of exon 1 and to the beginning of exon 2 (Fig. 1B),
which suggests that muscleblind might have two transcription start
sites (TSS). To test this hypothesis we defined two regions as
potential promoters of muscleblind. P1 ranged from -180 to +335
(515 bp long) while P2 spanned from -243 to +343 (586 bp long)
(Fig. 1A), defining as +1 the first bp in exon 1 and exon 2,
respectively. Although core promoter regions are typically defined
as -50 to +50 of the transcription start site [33], a longer region was
used to include not only the core promoter but also proximal
promoter sequences with potential activator binding sites.
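
The region lengths quoted above follow from the TSS-relative coordinates if one assumes the common convention that position -1 is immediately followed by +1 (there is no position 0). The short Python sketch below checks that arithmetic for P1 and P2, and for the shorter P1.1 and P2.1 fragments described in the next paragraph. The function name `region_length` is introduced here only for illustration and is not part of the original study.

```python
def region_length(start: int, end: int) -> int:
    """Length in bp of a region given in TSS-relative coordinates.

    Assumes +1 is the first transcribed base and that position 0 does
    not exist, so -1 is immediately followed by +1.
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this convention")
    if start > end:
        raise ValueError("start must not exceed end")
    length = end - start + 1
    # A region spanning the TSS skips the non-existent position 0.
    if start < 0 < end:
        length -= 1
    return length


# Coordinates as stated in the text.
regions = {
    "P1": (-180, 335),
    "P2": (-243, 343),
    "P1.1": (-104, 116),
    "P2.1": (-81, 316),
}
lengths = {name: region_length(s, e) for name, (s, e) in regions.items()}
print(lengths)
```

Under this convention the coordinates reproduce the 515 bp, 586 bp, 220 bp and 397 bp sizes given in this section.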

A SCOPE database analysis of Drosophila promoter elements
[34] confirmed the accumulation of known consensus sequences in
P1 and P2, thus supporting their potential as promoter regions
(Fig. 1C,D). Furthermore, these motifs were phylogenetically
conserved among Drosophila species in the Multiz Alignments &
phastCons Scores provided by the UCSC, supporting the
relevance of these non-coding regions (data not shown). To test
the functional relevance of the putative promoters we generated
reporter constructs in which P1 and P2 drove expression of Firefly
luciferase. In addition, we also tested the activity of shorter
versions, contained in the longer ones, of 220 bp (-104 to
+116; P1.1) and 397 bp (-81 to +316; P2.1), respectively, in an
attempt to define minimal promoters. Dual luciferase reporter
assays in Drosophila S2 cells transfected with the resulting constructs
revealed that P1 increased luciferase readings more than
100-fold relative to the promoter-less control. This was 2.5-fold
the transcription measured for P1.1, 12-fold higher than the
luciferase activity driven by P2 and 7 times the activity measured
for P2.1 (Fig. 1E). Thus, robust expression of luciferase was
observed in P1 constructs in comparison to reporter expression
driven by P2, and the higher activity obtained with P1 in
comparison to P1.1 suggests that P1 contains proximal promoter
elements that are not included in P1.1.
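
The normalization behind these fold values can be sketched as follows: Firefly readings are divided by Renilla readings from the same well, and the resulting ratio is expressed relative to the promoter-less vector. The Python sketch below is an illustration under stated assumptions, not the authors' analysis script. The function name and the raw readings are hypothetical, and the implied activities for P1.1, P2 and P2.1 take 100 as a lower bound for P1, since the text reports "more than 100-fold".

```python
from statistics import mean


def relative_luciferase(firefly, renilla, empty_firefly, empty_renilla):
    """Firefly/Renilla activity relative to the promoter-less vector.

    Each argument is a list of raw luminescence readings from replicate
    wells; firefly[i] and renilla[i] come from the same well.
    """
    sample = mean(f / r for f, r in zip(firefly, renilla))
    baseline = mean(f / r for f, r in zip(empty_firefly, empty_renilla))
    return sample / baseline


# Hypothetical raw readings, invented for illustration only.
p1_example = relative_luciferase(
    firefly=[52000, 48000],
    renilla=[500, 480],
    empty_firefly=[400, 420],
    empty_renilla=[400, 420],
)

# Fold relationships quoted in the text, with 100 as the lower bound for P1.
p1_fold = 100.0
implied = {
    "P1.1": p1_fold / 2.5,  # P1 was 2.5-fold the activity of P1.1
    "P2": p1_fold / 12,     # P1 was 12-fold the activity of P2
    "P2.1": p1_fold / 7,    # P1 was 7 times the activity of P2.1
}
print(p1_example, implied)
```

Read this way, the quoted ratios place P1.1 at roughly 40-fold and P2 and P2.1 at roughly 8-fold and 14-fold over the promoter-less control, if P1 sits at its 100-fold lower bound.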

2. Identification of Putative cis-regulatory Modules

Potential cis-regulatory elements of transcription can be
identified as highly conserved non-coding regions in phylogenetic
footprinting analyses (as an example see [35]). A fragment of
120 kb harbouring most of the muscleblind gene, plus 20 kb
upstream of the gene (complete sequence analyzed chr2R:
13133058-13252891), was used as the reference DNA to align
orthologous sequences from 12 Drosophilids. Using the bioinformatics
tool rVista, an intronic sequence showing above 90%
identity in a 100 bp window between Drosophila melanogaster and
D. mojavensis was selected. This sequence, referred to as the H region,
was 872 bp long and was located in intron 2 (Fig. 2A and Table 1).
Moreover, chromatin immunoprecipitation followed by microarray
analysis (ChIP-on-chip) data revealed putative cis-regulatory
modules (CRMs) M1, M2, M3 and ML (Fig. 2A and Table 1) in
the muscleblind locus that bound Drosophila Mef2 [36,37,38]. We
found these results particularly relevant because Mef2 is a known
activator of muscleblind expression in the Drosophila embryo [2].
Interestingly, these candidate CRMs bind not only Mef2 in ChIP-on-chip
experiments but also other muscle organizing factors such
as Biniou (Bin), Tinman (Tin) or Twist (Table 1). The ML region
bound Mef2 only in late embryos according to [37].
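
The conservation criterion used to select the H region (more than 90% identity in a 100 bp window) can be illustrated with a simple sliding-window scan over a pairwise alignment. The Python sketch below is a minimal illustration of that criterion and is not the rVista implementation. The function names are hypothetical, gap columns are assumed to count as mismatches, and the toy sequences use a 5 bp window so that the result can be checked by hand.

```python
def window_identity(aln_a: str, aln_b: str, window: int = 100):
    """Yield (start, percent_identity) for every window of an alignment.

    aln_a and aln_b are two rows of a pairwise alignment (same length,
    '-' for gaps). Gap columns are counted as mismatches here, which is
    an assumption of this sketch.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    for start in range(len(aln_a) - window + 1):
        a = aln_a[start:start + window]
        b = aln_b[start:start + window]
        matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
        yield start, 100.0 * matches / window


def conserved_windows(aln_a: str, aln_b: str, window: int = 100,
                      threshold: float = 90.0):
    """Return start positions of windows with identity above threshold."""
    return [start for start, pid in window_identity(aln_a, aln_b, window)
            if pid > threshold]


# Toy alignment (invented): a single mismatch at the last column.
toy_a = "ACGTACGTAC"
toy_b = "ACGTACGTAA"
hits = conserved_windows(toy_a, toy_b, window=5, threshold=90.0)
print(hits)
```

With real data, the two rows would come from the D. melanogaster and D. mojavensis alignment and the defaults (100 bp, 90%) would apply.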

3. In vivo Testing of Reporter Expression Reveals Tissue-specific Enhancers

To test the regulatory potential of the highly conserved (H) and
the Mef2-bound (M1, M2, M3 and ML) genomic regions
involving the muscleblind locus, we generated fusion constructs in
the Drosophila transformation vector pH-Stinger. This vector

[Figure 1C,D: genomic sequence around exon 1 (C) and exon 2 (D) of muscleblind.]

Figure 1. Organization of the muscleblind genomic region. (A) Representation of 90 kb of the Drosophila melanogaster muscleblind gene.
Green boxes represent exons and black lines introns. Representation according to [5]. Candidate promoter regions P1 and P2, and their shorter
versions P1.1 and P2.1, are indicated. Black arrows denote putative transcription start sites located in exons 1 and 2. (B) 5'-Ends of ESTs mapping to
the muscleblind gene, according to the UCSC Genome Browser, suggest that most transcripts start in exon 1 and 2. Genomic context around exon 1
(C) and exon 2 (D). Exonic sequence is in capital letters, P1 and P2 are highlighted in blue (GenBank accession numbers KJ398152 and KJ398154,
respectively), and P1.1 and P2.1 are underlined (GenBank accession numbers KJ398151 and KJ398153, respectively). Blue boxes denote promoter
consensus sequences with Sig value greater than 7 according to Scope (significant by default). (E) Relative luciferase activity from transiently
transfected Drosophila S2 cells. Luciferase activity was stronger from P1 (or P1.1) than from P2 (or P2.1) promoters. Luciferase activity was measured
48 h after transfection. Renilla expression levels were used to normalize cell number, transfection efficiency, and general effects on gene
transcription. All data were also normalized to luciferase levels of the empty vector pGL3 Basic. ***P<0.001. Bar graph shows means ± s.e.m. from
three independent experiments with three technical replicates each.
doi:10.1371/journal.pone.0093125.g001



contains the heat shock protein 70 (Hsp70) promoter and is specifically
designed to avoid chromatin configuration effects ("position
effects") by flanking the eGFP reporter cDNA with two copies
of insulator sequences from the gypsy transposon [39]. Embryonic
expression of eGFP was assessed either directly (green
fluorescence) (Fig. 2) or by immunodetecting the reporter protein
with a polyclonal anti-GFP antibody (Fig. 3). Only embryos
carrying reporter constructs under the control of the M2 and ML
candidate CRMs revealed consistent patterns (Fig. 2G-I; Fig. 3;
and not shown) and in both cases eGFP expression was restricted
to nuclei, as expected from the presence of a nuclear localization
signal in the vector.



We have previously shown that Muscleblind is localized in the
nuclei of embryonic pharyngeal, visceral and somatic muscles, in
the larval photoreceptor system, and in repeated clusters of cells
within the central nervous system [2,3] (Fig. 2B,C). In combination
with the Hsp70 promoter, M2 drove robust eGFP expression in the
somatic musculature of late embryos, approximately starting from
stage 13. Notably, no eGFP expression was detected in other
muscle derivatives or tissues where endogenous muscleblind is
normally detected, particularly the CNS (Fig. 2I). As a control, we
generated transgenics carrying a promoter-less M2:eGFP construct,
which revealed no eGFP expression (Fig. 2D-F), thus
confirming that M2 had no promoter activity by itself but requires
the presence of a promoter to exert its enhancer activity. Similarly,

Figure 2. ME reproduces muscleblind expression in the embryonic somatic musculature. (A) Localization of the putative cis-regulatory
modules M1, M2, ML, H and M3, indicated as orange boxes, in the context of the muscleblind genomic locus. Fluorescence confocal images of lateral
(B,E,H) and ventral (C,F,I) views of late Drosophila embryos. (D,G) Schematic representation of the reporter constructs used to transform the germline
of Drosophila. In control yw flies (B,C) an anti-Mbl antibody detects robust expression in the somatic musculature and in the CNS (green). Direct
visualization of the GFP reporter under the control of the ME enhancer in the pH-Stinger vector (E,F,H,I). Promoter-less ME constructs (D-F) do not
activate GFP expression and serve as negative controls. Flies carrying the ME enhancer upstream of Hsp70 (G-I) reproduce Muscleblind expression in
the somatic musculature but not in the CNS. All micrographs were taken at 200× magnification. Anterior is to the left and dorsal up unless otherwise
stated.
doi:10.1371/journal.pone.0093125.g002



double eGFP and Muscleblind immunostaining of fly embryos
carrying ML:eGFP constructs revealed that ML drove expression
in clusters of cells in the ventral cord of late embryos that
overlapped with those expressing endogenous Muscleblind (Fig. 3).
The first signal appeared at developmental stage 12 and no eGFP
expression was detected in tissues other than the CNS. Consistently,
ML included predicted binding sites for factors involved in
nervous system development such as Ladybird early (Lbe),
Ladybird late (Lbl), Krüppel (Kr) and Hunchback (Hb)
[40,41,42,43] (Fig. 4G). In summary, these results support that
M2 and ML are somatic muscle and CNS-specific enhancers of
muscleblind, respectively, at least during embryonic development.



Table 1. Putative cis-regulatory modules (CRMs) chosen for in vivo testing of reporter expression.

CRM   Location   Size (bp)   TF binding sites
M1    Upstream   2278        1 Twi, 6 Mef2, 1 Bin
M2    Intron 2   3340        2 Mef2, 3 Bin
M3    Exon 4     1240        3 Mef2, 1 Bin
ML    Intronic   830         3 Mef2, 1 Bin
H     Intron 2   872         0

All except region H according to [37].
doi:10.1371/journal.pone.0093125.t001

Figure 3. NE reproduces muscleblind expression in the central nervous system. (A) Schematic representation of the reporter construct used
to transform the Drosophila germline. Fluorescence confocal images of lateral (B-D) and ventral (E-J) views of late Drosophila embryos expressing
construct (A) co-stained with anti-GFP (green) and anti-Mbl antibodies (red). NE drives expression in the CNS (arrows; C,I) overlapping Muscleblind
expression (B,E,H; D,G,J show the merge in yellow). No signal of the reporter was observed in tissues other than the CNS. Endogenous Muscleblind
expression in the muscles is in focus in (E,G,H,J). Micrographs were taken at 200× (B-G) and 400× magnification (H-J). Anterior is to the left and
dorsal up unless otherwise stated.
doi:10.1371/journal.pone.0093125.g003



We therefore renamed these candidate CRMs as "ME" (from
muscle enhancer) and "NE" (from nervous system enhancer).
Importantly, both ME and NE were capable of activating a
heterologous promoter, a typical ability of transcriptional enhancer
elements.

4. Characterization of the Muscle and Nervous Enhancer 
Elements of muscleblind 

To test the ability of ME and NE genomic regions to enhance
transcription from the putative muscleblind promoters, we used
luciferase reporter assays in Drosophila S2 cells. We used S2 cells
because this is a well characterized cell line whereas muscle or
neuron-specific cell lines were not immediately available. Both
enhancer regions were cloned upstream of each of the putative
promoters P1, P1.1, P2 and P2.1 in their forward (Fig. 4A) and
reverse (not shown) orientations in the pGL3 Basic vector, which
carries the Firefly luciferase reporter. As controls, ME and NE
were tested in promoter-less constructs and no luciferase activity
was detected (data not shown), thus confirming that the enhancer
regions do not have any transcriptional activity by themselves.
When promoters were in the forward orientation we observed that
the transcription originating from P1 and P1.1 was strongly
enhanced by ME, whereas NE had no effect on P1.1 and even
decreased transcription when in combination with P1 (Fig. 4B).
Similarly, P2 and P2.1 promoter activity was significantly
enhanced by NE (around 30% and 20%, respectively), but
was not enhanced by ME, which even
repressed transcription from P2.1 (Fig. 4C). As a control of promoter
directionality, luciferase levels of constructs carrying promoters in
their reverse orientation were measured. Consistently, relative
luciferase readings dropped to close to background levels in
constructs containing promoters in their reverse orientation (Fig. 4,
compare B,C with D), although both ME and NE still managed to
significantly potentiate transcription from P1 and P1.1. Thus,
promoter activity is orientation-dependent, as reversed promoters
expressed significantly less luciferase reporter, and the activity of
the ME and NE enhancers on P1 and P2 suggests enhancer-promoter
communication specificity.

ME function was further analyzed to narrow down sequences
necessary for enhancer activity. This involved testing three smaller
regions, approximately 1 kb each, here referred to as ME.1, ME.2
and ME.3 according to their relative position in the original region
(Fig. 4A). These sequences were cloned upstream of the P1 and
P1.1 promoters and were used in luciferase reporter assays in S2
cells (Fig. 4E). Compared to the luciferase activity of the promoter
alone, ME.3 was the only subregion able to significantly increase
transcription from P1.1, also showing the same trend on the P1
promoter. Notably, all other subregions tested either did not boost
expression from P1 or P1.1 or even inhibited it. Therefore, these
data and bioinformatics analyses support that ME.3 contains
sequence motifs necessary for ME enhancing activity, including
consensus Mef2 binding sites (Fig. 4F), but they also suggest that
for maximum enhancing activity all three subregions are required.
Bioinformatics analyses in NE found an enrichment of targets for
nervous system transcription factors (Fig. 4G).
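
As a rough illustration of what a consensus-site search involves, the Python sketch below scans a sequence for the commonly cited MEF2 consensus YTA(A/T)4TAR using a regular expression. This is a simplification introduced here and is an assumption of the sketch: the predictions in this study used the Jaspar and rVista programs, which score position weight matrices rather than exact consensus matches. The function name and the test sequence are hypothetical.

```python
import re

# Commonly cited MEF2 consensus: YTA(A/T)4TAR, with Y = C/T and R = A/G.
# The lookahead allows overlapping matches to be reported.
MEF2_CONSENSUS = re.compile(r"(?=([CT]TA[AT]{4}TA[AG]))")


def find_mef2_sites(sequence: str):
    """Return (0-based start, matched site) for each consensus match.

    The consensus is its own reverse complement, so scanning the forward
    strand alone already finds sites on both strands.
    """
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in MEF2_CONSENSUS.finditer(seq)]


# Invented test sequence containing one consensus site.
sites = find_mef2_sites("ggCTAAAAATAGcc")
print(sites)
```

An exact-match scan of this kind is stricter than matrix-based prediction, so it would be expected to report fewer candidate sites than the tools used in the study.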

5. Functional Conservation of Human MBNL1 Promoter 

No sequence conservation was apparent between the Drosophila muscleblind
and human MBNL1 promoter sequences. However, analysis
of available cDNA sequences and ESTs in the MBNL1 locus
suggested that human MBNL1 might also use TSS located in exon
1 and in exon 2 (Fig. 5A). Consistently, the putative TSS regions include

[Figure 4F,G: nucleotide sequences of ME.3 (F) and NE (G) with predicted transcription factor binding sites.]



Figure 4. ME and NE boost P1 and P2 promoter activity, respectively. (A) Schematic representation of the Firefly luciferase reporter
constructs used to transiently transfect Drosophila S2 cells. (B) ME, but not NE, potentiates P1 or P1.1 promoter activity. Basal promoter activity of P1
is significantly higher than P1.1 in these assays. Conversely, NE, but not ME, weakly enhanced P2 or P2.1 promoter function (C); the relative
luciferase activity measured was approximately one tenth of that from P1 or P1.1. (D) Reversed P1 and P1.1 promoters still responded to ME and NE,
although at much lower levels, whereas P2 and P2.1 did not significantly change reporter expression. (E) ME (GenBank accession number KJ201027)
was subdivided into three smaller regions of approximately 1 kb (ME.1, ME.2 and ME.3). ME.3 retained most of the ability of ME to boost expression,
although it reached statistical significance only for P1.1. ME.3 and NE (GenBank accession number KJ201028) sequences are enriched in consensus
binding sites for Mef2 (underlined) (F) and nervous system (G) transcription factors according to the following code: Hunchback sites, underlined;
Ladybird early and Ladybird late, bold; Krüppel, italics. Predictions used the Jaspar and rVista programs.
doi:10.1371/journal.pone.0093125.g004



promoter marks such as CpG islands and histone modification
tracks (Fig. 5A,B). We defined 500 bp around the predicted start
region from both exons to test them as putative human MBNL1
promoters: Hsa-P1 in exon 1 (chr3: 151985544-151985045) and
Hsa-P2 in exon 2 (chr3: 152016823-152017382). Synthetic Hsa-P1
and Hsa-P2 sequences were designed to replace the Hsp70
promoter in the Drosophila transformation pH-Stinger vector. No
eGFP expression was observed in transgenic fly embryos carrying
any of the human promoters alone (data not shown and Fig. 5D-F).
However, we observed a robust expression of eGFP in the
somatic musculature of embryos when ME drove expression of the
Hsa-P1 promoter (Fig. 5G-I, compare to Fig. 2B,C). This
expression was not observed in similar reporter constructs where
Hsa-P2 replaced Hsa-P1 (ME-Hsa-P2; not shown). These data
support that Hsa-P1 can initiate transcription, as we have also
demonstrated for the muscleblind P1 promoter, and that ME is a
muscle enhancer on a variety of promoters.

Figure 5. Genomic organization of the human MBNL1 gene. (A) Scale representation of 198 kb of the human MBNL1 locus. Green boxes
correspond to exons and black lines to introns. Tested promoter regions are indicated as P1 and P2; a yellow circle denotes a predicted CpG island. (B)
Schematic representation of H3K27Ac marks, typical of promoter regions, on seven human cell lines. (C) 5'-Ends of ESTs mapping to the MBNL1 locus
support two potential transcription start sites for the gene. Data according to the UCSC Genome Browser [58]. (E,F,H,I) Direct visualization of the eGFP
reporter under the control of the ME enhancer. (D-F) Enhancer-less Hsa-P1 construct is a negative control. (G-I) Flies carrying the ME enhancer
upstream of the human MBNL1 promoter (Hsa-P1) reproduce Muscleblind expression in the somatic musculature. (E,H) Lateral and (F,I) ventral views
of late embryos. All micrographs were taken at 200× magnification. Anterior is to the left and dorsal up, unless otherwise stated.
doi:10.1371/journal.pone.0093125.g005



Discussion 

Muscleblind orthologues have attracted intense research interest
due to their important role in vertebrate muscle development as
well as their involvement in several degenerative RNA-mediated
diseases including DM1 and DM2, Huntington's disease,
Huntington's disease-like 2 (HDL2) or spinocerebellar ataxia 8 (SCA8)
[6,44,45,46]. More recently, MBNL proteins were found to
repress embryonic stem cell alternative splicing patterns, uncovering
an additional role in the control of cell pluripotency [22].
Despite this, little is known about the transcriptional regulation of
muscleblind genes both in Drosophila and in vertebrates.

As a means of dissecting the cis-regulation of muscleblind, we
analyzed a genomic DNA fragment harboring the muscleblind locus
and its upstream region, looking for CRMs that regulate basal
initiation of transcription (promoters) and tissue-specific expression
(enhancers). In silico and in vivo studies identified two putative
promoters, P1 in exon 1 and P2 in exon 2, and two intronic tissue-specific
regulatory elements, a region of 3340 bp which drives
specific expression in somatic muscle (ME) and a region of 830 bp
which drives expression in the central nervous system (NE). Both
enhancers had been selected because of their enrichment in Mef2
binding sites according to ChIP-on-chip data [37], since Mef2
is a known positive regulator of muscleblind in the embryo [2].
Nevertheless, other enhancer elements must exist in order to
explain the rich embryonic expression pattern of muscleblind, which
also includes expression in visceral and pharyngeal musculature,
Bolwig's organ (the larval photoreceptor system) and the
imaginal discs [2,3]. In this regard, putative CRMs that did not
reproduce any embryonic pattern in our study cannot be
ruled out as functional in other developmental stages.

Enhancer regions ME and NE were able to boost expression 
originating from the heterologous Hsp70 promoter in transgenic 
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embryos. However, in luciferase S2 assays, ME preferentially 
activated the PI promoter while NE showed preference for P2, 
although the enhancer activity of NE was smaller than that of ME 
and transcription arising from P2 was on average one tenth of that 
from PI (Fig. 4i?,C). Our results in transgenic flies carrying the ME 
in combination with the human MBMLl P 1 promoter also suggest 
enhancer promoter specificity as ME only induced reporter 
expression from Hsa-Pl, but not from Hsa-P2. The use of 
alternative promoters is a known mechanism of transcriptional 
regulation, which has been reported to influence levels of 
transcription, turnover or translation efficiency of mRNA isoforms 
with different leader exons, tissue specificity [47] and to generate 
protein isoforms differing in the amino termini (reviewed in [48]). 
Furthermore, different core promoters have been found to possess 
distinct regulatory activities driven by the same enhancer in the 
Drosophila embryo [49]. In muscleblind, the potential in vivo use of
P1 and P2 as alternative promoters would have no consequences
for the encoded protein since the start codon is located in exon
2, which is downstream of both. However, the in vivo relevance of
the internal promoter P2 remains to be specifically addressed. It is 
also worth mentioning that whereas the identified enhancers
provided strong activation of the reporter in vivo, particularly ME,
the measured activation in S2 cells was modest, reaching some
6-fold increase over the promoter-alone condition (Fig. 4D). This
may stem from the particular combination of transcription factors
that S2 cells express, which may not be particularly favourable for
activating myogenic enhancers. Mef2, for example, is weakly
expressed in S2 cells according to modENCODE data [50],
and, consistently, Muscleblind is only barely detectable in this cell 
line [51]. Nevertheless, the low expression of Muscleblind in S2 
cells offers an opportunity to test the activating potential of 
candidate regulatory transcription factors. 

Although sequences homologous to ME or NE are not obvious in
human MBNL1, ENCODE ChIP-seq data confirm that
there is a high concentration of transcription factor binding sites in
the first intron of MBNL1, including multiple MEF2A and
MEF2C binding regions, thus suggesting that MEF2, a central
regulator of diverse developmental programs [52], is also involved
in the regulation of human MBNL1 transcription. Indeed, detailed
information on the multiple transcription factors converging on 
the Drosophila ME and NE enhancer elements would help in the 
identification of the functionally equivalent enhancer regions that 
integrate inputs from the same factors in humans. Although 
hypothetical, functional conservation between fly and human
muscleblind enhancers is conceivable. Deformed enhancers, for
example, drive meaningful spatial expression patterns in vivo in a
mouse context [53]. In any event, this initial characterization of
cis-regulatory regions is the first step towards the understanding of
the transcriptional regulation of muscleblind. 

Materials and Methods 

Constructs 

The P1 and P2 promoter regions, and their shorter versions P1.1 and
P2.1, were synthesized by GenScript with NheI/XhoI terminal
adapters and were provided cloned into the pUC57 vector. High
fidelity PCR (KAPA HiFi DNA Polymerase, KAPA Biosystems)
was used to subclone them into the pGL3-Basic vector (Promega)
previously linearized with the same enzymes. ME, ME.1, ME.2,
ME.3 (spanning from 13170622-13171543, ME.1; 13171544-
13172759, ME.2; 13172760-13173908, ME.3) and NE were
obtained from Drosophila genomic DNA by high fidelity PCR and
were cloned into KpnI/SacI digested pGL3-Basic vectors already
including different promoter regions. To generate transgenic flies,
M1, M2, M3, H and ML fragments were PCR amplified from
genomic DNA and were cloned into the BglII and XbaI sites of the
Drosophila expression vector pH-Stinger [54]. Synthetic Hsa-P1
and Hsa-P2 promoter regions, including PstI, BglII, XhoI (5') and
HindIII, PstI (3') adapter sites, were cloned into the pUC57 vector
and subsequently transferred to the PstI site of the pH-Stinger
vector. To generate ME-Hsa-P1 and ME-Hsa-P2 constructs, M2
was amplified with specific oligos containing BglII/XhoI adapters,
digested, and cloned into the corresponding sites of the pH-Stinger
vector already containing the candidate human promoters. All
constructs were confirmed by sequencing. The primers used are
described in Table 2.
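
As an illustration of the cloning strategy described above, the following minimal sketch locates the restriction sites carried by a few of the Table 2 primers. It is not part of the original methods: the function name `find_sites` and the choice of primers are ours, and the recognition sequences are the standard ones for the enzymes named in this section.

```python
# Standard recognition sequences of the enzymes named in the Constructs section.
SITES = {
    "BglII": "AGATCT",
    "XbaI": "TCTAGA",
    "NheI": "GCTAGC",
    "XhoI": "CTCGAG",
    "KpnI": "GGTACC",
    "SacI": "GAGCTC",
}


def find_sites(primer):
    """Return {enzyme: 0-based position} for every site present in the primer.

    Hypothetical helper for illustration; only the top strand is searched,
    which is sufficient because all six sites are palindromic.
    """
    primer = primer.upper()
    return {
        name: primer.find(site)
        for name, site in SITES.items()
        if site in primer
    }


# A subset of the primers listed in Table 2 (sequences copied as printed).
PRIMERS = {
    "M1 forward": "GAAGATCTCGGTTCGACGTGACTTTTGC",
    "M1 reverse": "GCTCTAGATAATAATAGATAATAAATATGG",
    "P1 forward": "ATTGCTAGCTTGAAGTTACTAAA",
    "P1 reverse": "ATTCTCGAGGAACGACTTTCGACA",
    "ME forward": "TATGGTACCCTCGACAAACAATTGTCAG",
    "ME reverse": "ATTGAGCTCTGCCGATGACGTTTCAAT",
}

for name, seq in PRIMERS.items():
    print(name, find_sites(seq))
```

In this subset, each primer carries a single adapter site near its 5' end, matching the enzyme pairs named for the corresponding vector (BglII/XbaI for pH-Stinger, NheI/XhoI for promoters and KpnI/SacI for enhancers in pGL3-Basic).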

Cell Culture and Dual Luciferase Assays 

Drosophila melanogaster Schneider 2 cells (S2) were cultured at
27°C in growth medium containing 90% Schneider's insect medium
(Gibco), 10% heat inactivated fetal bovine serum (FBS), 100 units/
ml of penicillin and 100 mg/ml of streptomycin. 48 h before
transfection, 1 0^ log-phase cells were transferred onto 24-well
plates (300 µl per well). 4 µl of Cellfectin reagent (Invitrogen) in
200 µl of serum- and antibiotic-free medium were used to co-
transfect 450 ng of the pGL3 reporter plasmid of interest and
25 ng of Renilla luciferase. A GFP expressing vector served as
transfection efficiency control. Cells were incubated 16 h with
transfection mix, then the medium was replaced by Schneider's complete
medium and the culture was maintained for an additional 24 h at
27°C. Luciferase expression was monitored using the Dual-
Luciferase Reporter Assay System (Promega). Briefly, this involved
adding 100 µl of lysis buffer per well, shaking for 15 min and
transferring the lysate to a white 96-well plate with 40 µl of
luciferase substrate. After 10 s of luminescence detection,
Stop&Glo buffer was added and luminescence measurement was
repeated. Luminescence readings used an EnVision plate reader
(PerkinElmer). In cell culture luciferase reporter assays, all graphs
show the average of three independent experiments with three
technical replicates each. P-values were obtained using a two-
tailed, unpaired t-test (α = 0.05). Welch's correction was applied
when variances were significantly different.
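
The quantification described above can be sketched as follows. This is a minimal illustration under our own assumptions, not the analysis script used in the study: the readings are invented, the function names are hypothetical, and only the Welch t statistic and its Welch-Satterthwaite degrees of freedom are computed (the P-value would be read from a t distribution with that many degrees of freedom).

```python
from math import sqrt
from statistics import mean, variance


def relative_activity(firefly, renilla):
    """Normalize each firefly reading by its paired Renilla reading."""
    return [f / r for f, r in zip(firefly, renilla)]


def fold_activation(construct, promoter_alone):
    """Mean normalized activity relative to the promoter-alone condition."""
    return mean(construct) / mean(promoter_alone)


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t_stat = (mean(a) - mean(b)) / sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t_stat, dof


# Invented luminescence readings (arbitrary units), for illustration only.
promoter = relative_activity([1000.0, 1100.0, 900.0], [500.0, 500.0, 500.0])
enhancer = relative_activity([6000.0, 6600.0, 5400.0], [500.0, 500.0, 500.0])

fold = fold_activation(enhancer, promoter)
t, df = welch_t(enhancer, promoter)
print(f"fold activation = {fold:.2f}, Welch t = {t:.2f}, df = {df:.2f}")
```

Normalizing to Renilla before computing the fold change corrects each well for differences in transfection efficiency, which is the purpose of the co-transfected Renilla plasmid.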

Immunohistochemistry of Drosophila Embryos 

Embryos were fixed for 20 min using 4% paraformaldehyde in
PBS, devitellinized with a heptane:methanol 1:1 mixture, and
blocked with 0.1% Triton-X-100 in PBS (PBT) with 1% BSA for
15 min and later with 2% BSA in PBT for 30 min. Subsequently,
they were incubated with rabbit anti-GFP (1:200, Torrey Pines
Biolabs) antibody diluted in blocking solution containing 1%
donkey serum for 2 h at room temperature. After washes, embryos
were incubated with an anti-rabbit-FITC (1:200, Calbiochem)
secondary antibody for 45 min. Muscleblind detection used sheep
anti-Muscleblind (1:500 [55]) for 2 h followed by washes and
primary antibody recognition with a biotin-conjugated anti-sheep
secondary antibody (1:100, Sigma) for 2 h. Then, washed embryos
were incubated with ABC solution (ABC kit, VECTASTAIN) for
30 min at room temperature, and were washed and incubated
with streptavidin-Texas Red (1:1000, Vector) for 45 min. In all
cases embryos were washed 3× with 1% BSA in PBT and were
mounted in Vectashield (Vector) with 2 µg/ml DAPI. Images
were taken on an Olympus FluoView FV100 confocal microscope.
At least 10-15 embryos of the desired stage, and showing the
relevant expression patterns, were analyzed.

Drosophila Strains and Transgenics 

Genomic DNA was extracted from the Drosophila genome
project strain y^1; Gr22b^1 Gr22d^1 cn^1 CG33964^R4.2 bw^1 sp^1; LysC^1
MstProx^1 GstD5^1 Rh6^1 (Bloomington Drosophila Stock Center,
Bloomington IN). y^1 w^1118 flies were also from Bloomington IN.
All constructs were injected into w^1118 embryos by BestGene,
typically resulting in 2-6 independent transgenic lines.

Table 2. Sequence and melting temperature (Tm) of PCR primers used for cloning genomic DNA.

Primer pair      Forward 5'→3'                 Tm (°C)  Reverse 5'→3'                   Tm (°C)
M1 pH-Stinger    GAAGATCTCGGTTCGACGTGACTTTTGC  70.1     GCTCTAGATAATAATAGATAATAAATATGG  52.2
M2 pH-Stinger    GAAGATCTCTCGACAAACAATTGTCAG   66.8     GCTCTAGATGCCGATGACGTTTCAAT      70.6
M3 pH-Stinger    GAAGATCTTGAGTTTCGGTCTGATGCC   66.9     GCTCTAGAATGCATTTCAAATTTTGG      62.0
H pH-Stinger     GAAGATCTACTCTTTGTCTATTTTCAC   52.3     GCTCTAGACATGCCATTGTGGAAAAGC     67.5
ML pH-Stinger    GAAGATCTTAACCAGCAGTTGATGTCT   65.5     GCTCTAGATTTGCCTATCCCATCGGCA     74.2
P1 pGL3-Basic    ATTGCTAGCTTGAAGTTACTAAA       55.9     ATTCTCGAGGAACGACTTTCGACA        68.9
P1.1 pGL3-Basic  ATTGCTAGCGCAGAGAGTGAAAGA      67.2     ATTCTCGAGGTGACTGAGCAGAA         66.3
P2 pGL3-Basic    ATTGCTAGCTTGCATTCTGCATAA      65.5     ATTCTCGAGGGTTTTGAGTGCTCC        69.3
P2.1 pGL3-Basic  ATTGCTCGCATTTCTATTTATCTC      56.6     ATTCTCGAGCCTCTCGCGACAC          71.0
ME pGL3-Basic    TATGGTACCCTCGACAAACAATTGTCAG  69.9     ATTGAGCTCTGCCGATGACGTTTCAAT     73.4
NE pGL3-Basic    TATGGTACCTAACCAGCAGTTGATGTCT  67.2     ATTGAGCTCTTTGCCTATCCCATCGGCA    76.7
ME.1 pGL3-Basic  TATGGTACCCTCGACAAACAATTGTCAG  69.9     ATTGAGCTCTAACGGTCGGCAAAGG       72.3
ME.2 pGL3-Basic  TATGGTACCGTGTGCTGACGTCGTTAGG  74.1     ATTGAGCTCAAAGGCACAGGGTCC        72.0
ME.3 pGL3-Basic  TATGGTACCGGTTGACAAACGATTCGG   75.7     ATTGAGCTCTGCCGATGACGTTTCAAT     73.4

Restriction sites are highlighted in bold.
doi:10.1371/journal.pone.0093125.t002

Web Resources 

Predictions of transcription factor binding sites used JASPAR 
[56]. Phylogenetic conservation employed the Whole Genome 
rVISTA browser [57]. EST mapping was according to the UCSC 
Genome Browser database [58]. Predicted promoter consensus
sequences used the Suite for Computational Identification of
Promoter Elements (SCOPE) [34]. Reference genomes were the
Drosophila BDGP R5/dm3 and human GRCh37 releases.
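
To give a sense of what a binding-site prediction involves, the sketch below scans a sequence for an IUPAC consensus. It is a deliberately simplified stand-in for the matrix-based scoring that JASPAR performs and was not used in this study. The consensus YTAWWWWTAR is the MEF2 motif commonly cited in the literature (an assumption on our part, not taken from this paper), and the test sequence is invented.

```python
import re

# IUPAC nucleotide codes translated to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}


def iupac_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[base] for base in consensus.upper())


def scan(sequence, consensus):
    """Return (0-based start, matched site) for every hit, overlaps included.

    Hypothetical helper for illustration. The lookahead allows overlapping
    matches to be reported.
    """
    pattern = re.compile("(?=(" + iupac_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Commonly cited MEF2 consensus (assumption, not from this paper).
MEF2_CONSENSUS = "YTAWWWWTAR"

# Invented sequence containing one consensus match.
toy_sequence = "GGCCTAAAAATAGCCGG"

print(scan(toy_sequence, MEF2_CONSENSUS))
```

Because this consensus is its own reverse complement, scanning one strand is enough here; a matrix-based tool would additionally score partial matches rather than require an exact fit.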